Overview

Dataset statistics

Number of variables: 11
Number of observations: 569
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 49.0 KiB
Average record size in memory: 88.2 B
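The two memory figures above appear consistent with each other if the average record size is taken as the total size divided by the number of observations, with 1 KiB = 1024 B. A quick check under that assumption (the helper name is illustrative and not part of pandas-profiling):

```python
def average_record_size(total_kib: float, n_rows: int) -> float:
    """Average bytes per row, assuming 1 KiB = 1024 bytes."""
    return total_kib * 1024 / n_rows


# Figures reported in the overview: 49.0 KiB over 569 observations.
size = average_record_size(49.0, 569)
print(f"{size:.1f} B")
```

Because the reported total is itself rounded to one decimal place, the result can only be expected to agree with the reported value to about that precision.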

Variable types

NUM: 10
CAT: 1

Warnings

MeanPerimeter is highly correlated with MeanRadius and 1 other field [High correlation]
MeanRadius is highly correlated with MeanPerimeter and 1 other field [High correlation]
MeanArea is highly correlated with MeanRadius and 1 other field [High correlation]
MeanConcavePoints is highly correlated with MeanConcavity [High correlation]
MeanConcavity is highly correlated with MeanConcavePoints [High correlation]
MeanConcavity has 13 (2.3%) zeros [Zeros]
MeanConcavePoints has 13 (2.3%) zeros [Zeros]

Reproduction

Analysis started: 2020-11-09 01:06:27.077441
Analysis finished: 2020-11-09 01:06:50.251398
Duration: 23.17 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml
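A report of this kind can be regenerated along the following lines. This is a minimal sketch, assuming the data is available as a CSV file and that pandas and pandas-profiling are installed; the function name, file paths and report title are hypothetical, and the original config.yaml is not reproduced here.

```python
def build_profile_report(csv_path: str, html_path: str) -> None:
    """Profile a CSV file and write the result as an HTML report.

    The imports are deferred so that this function can be defined
    even where pandas or pandas-profiling is not installed.
    """
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    # The title is an illustrative choice, not taken from the report.
    report = ProfileReport(df, title="Profiling report")
    report.to_file(html_path)
```

A call such as `build_profile_report("data.csv", "report.html")` would then write the HTML file, with both paths standing in for real locations.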

Variables

Diagnosis
Categorical

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.4 KiB

Value               Count  Frequency (%)
B                   357    62.7%
M                   212    37.3%

[Figure: Frequencies of value counts]
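The frequency percentages for Diagnosis follow directly from the two counts. A small sketch that recomputes them (variable names are illustrative):

```python
# Counts for the two Diagnosis categories, as reported above.
counts = {"B": 357, "M": 212}

total = sum(counts.values())
# Share of each category as a percentage, rounded to one decimal place.
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

print(total, shares)
```

The total matches the 569 observations in the overview, and the benign class (B) is the majority at a little under two thirds of the rows.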

Unique

Unique: 0
Unique (%): 0.0%
[Figure: Histogram of lengths of the category]

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

MeanRadius
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 456
Distinct (%): 80.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.12729174
Minimum: 6.981
Maximum: 28.11
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.4 KiB

Quantile statistics

Minimum: 6.981
5-th percentile: 9.5292
Q1: 11.7
Median: 13.37
Q3: 15.78
95-th percentile: 20.576
Maximum: 28.11
Range: 21.129
Interquartile range (IQR): 4.08
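Two of the quantile statistics are derived from the others: the range is the maximum minus the minimum, and the IQR is Q3 minus Q1. A short check against the MeanRadius figures (a sketch; the dictionary keys are illustrative names):

```python
import math

# MeanRadius quantile statistics as reported above.
quantiles = {"min": 6.981, "q1": 11.7, "q3": 15.78, "max": 28.11}

value_range = quantiles["max"] - quantiles["min"]
iqr = quantiles["q3"] - quantiles["q1"]

# Compare with a tolerance, since binary floats cannot represent
# these decimals exactly.
print(math.isclose(value_range, 21.129, abs_tol=1e-9))
print(math.isclose(iqr, 4.08, abs_tol=1e-9))
```

The same two relations can be applied to the quantile blocks of the other numeric variables.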

Descriptive statistics

Standard deviation: 3.524048826
Coefficient of variation (CV): 0.2494497099
Kurtosis: 0.8455216229
Mean: 14.12729174
Median Absolute Deviation (MAD): 1.9
Skewness: 0.9423795717
Sum: 8038.429
Variance: 12.41892013
Monotonicity: Not monotonic
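Several of the descriptive statistics are tied together by simple identities: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. A sketch that checks the MeanRadius figures against these identities (variable names are illustrative):

```python
import math

n = 569                # number of observations, from the overview
total = 8038.429       # Sum
std = 3.524048826      # Standard deviation

mean = total / n       # expected to match the reported Mean
cv = std / mean        # expected to match the reported CV
variance = std ** 2    # expected to match the reported Variance

# Relative tolerances allow for rounding in the reported figures.
print(math.isclose(mean, 14.12729174, rel_tol=1e-8))
print(math.isclose(cv, 0.2494497099, rel_tol=1e-6))
print(math.isclose(variance, 12.41892013, rel_tol=1e-6))
```

With a CV of roughly 0.25, the spread of MeanRadius is about a quarter of its mean.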
[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value               Count  Frequency (%)
12.34               4      0.7%
12.77               3      0.5%
15.46               3      0.5%
12.89               3      0.5%
13.05               3      0.5%
11.71               3      0.5%
13.85               3      0.5%
11.89               3      0.5%
10.26               3      0.5%
12.18               3      0.5%
Other values (446)  538    94.6%

Minimum 5 values
Value               Count  Frequency (%)
6.981               1      0.2%
7.691               1      0.2%
7.729               1      0.2%
7.76                1      0.2%
8.196               1      0.2%

Maximum 5 values
Value               Count  Frequency (%)
28.11               1      0.2%
27.42               1      0.2%
27.22               1      0.2%
25.73               1      0.2%
25.22               1      0.2%

MeanTexture
Real number (ℝ≥0)

Distinct: 479
Distinct (%): 84.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 19.28964851
Minimum: 9.71
Maximum: 39.28
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.4 KiB

Quantile statistics

Minimum: 9.71
5-th percentile: 13.088
Q1: 16.17
Median: 18.84
Q3: 21.8
95-th percentile: 27.15
Maximum: 39.28
Range: 29.57
Interquartile range (IQR): 5.63

Descriptive statistics

Standard deviation: 4.301035768
Coefficient of variation (CV): 0.2229711841
Kurtosis: 0.7583189724
Mean: 19.28964851
Median Absolute Deviation (MAD): 2.81
Skewness: 0.6504495421
Sum: 10975.81
Variance: 18.49890868
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value               Count  Frequency (%)
14.93               3      0.5%
15.7                3      0.5%
18.9                3      0.5%
16.84               3      0.5%
17.46               3      0.5%
18.22               3      0.5%
20.52               3      0.5%
16.85               3      0.5%
19.83               3      0.5%
18.89               2      0.4%
Other values (469)  540    94.9%

Minimum 5 values
Value               Count  Frequency (%)
9.71                1      0.2%
10.38               1      0.2%
10.72               1      0.2%
10.82               1      0.2%
10.89               1      0.2%

Maximum 5 values
Value               Count  Frequency (%)
39.28               1      0.2%
33.81               1      0.2%
33.56               1      0.2%
32.47               1      0.2%
31.12               1      0.2%

MeanPerimeter
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 522
Distinct (%): 91.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 91.96903339
Minimum: 43.79
Maximum: 188.5
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.4 KiB

Quantile statistics

Minimum: 43.79
5-th percentile: 60.496
Q1: 75.17
Median: 86.24
Q3: 104.1
95-th percentile: 135.82
Maximum: 188.5
Range: 144.71
Interquartile range (IQR): 28.93

Descriptive statistics

Standard deviation: 24.29898104
Coefficient of variation (CV): 0.2642082899
Kurtosis: 0.9722135477
Mean: 91.96903339
Median Absolute Deviation (MAD): 12.71
Skewness: 0.9906504254
Sum: 52330.38
Variance: 590.4404795
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value               Count  Frequency (%)
82.61               3      0.5%
134.7               3      0.5%
87.76               3      0.5%
130                 2      0.4%
58.79               2      0.4%
133.8               2      0.4%
85.98               2      0.4%
113.4               2      0.4%
81.35               2      0.4%
84.08               2      0.4%
Other values (512)  546    96.0%

Minimum 5 values
Value               Count  Frequency (%)
43.79               1      0.2%
47.92               1      0.2%
47.98               1      0.2%
48.34               1      0.2%
51.71               1      0.2%

Maximum 5 values
Value               Count  Frequency (%)
188.5               1      0.2%
186.9               1      0.2%
182.1               1      0.2%
174.2               1      0.2%
171.5               1      0.2%

MeanArea
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 539
Distinct (%): 94.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 654.8891037
Minimum: 143.5
Maximum: 2501
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.4 KiB

Quantile statistics

Minimum: 143.5
5-th percentile: 275.78
Q1: 420.3
Median: 551.1
Q3: 782.7
95-th percentile: 1309.8
Maximum: 2501
Range: 2357.5
Interquartile range (IQR): 362.4

Descriptive statistics

Standard deviation: 351.9141292
Coefficient of variation (CV): 0.5373644594
Kurtosis: 3.652302762
Mean: 654.8891037
Median Absolute Deviation (MAD): 153.3
Skewness: 1.645732176
Sum: 372631.9
Variance: 123843.5543
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value               Count  Frequency (%)
512.2               3      0.5%
1214                2      0.4%
399.8               2      0.4%
758.6               2      0.4%
1075                2      0.4%
372.7               2      0.4%
684.5               2      0.4%
716.6               2      0.4%
1138                2      0.4%
658.8               2      0.4%
Other values (529)  548    96.3%

Minimum 5 values
Value               Count  Frequency (%)
143.5               1      0.2%
170.4               1      0.2%
178.8               1      0.2%
181                 1      0.2%
201.9               1      0.2%

Maximum 5 values
Value               Count  Frequency (%)
2501                1      0.2%
2499                1      0.2%
2250                1      0.2%
2010                1      0.2%
1878                1      0.2%

MeanSmoothness
Real number (ℝ≥0)

Distinct: 474
Distinct (%): 83.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.0963602812
Minimum: 0.05263
Maximum: 0.1634
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.4 KiB

Quantile statistics

Minimum: 0.05263
5-th percentile: 0.075042
Q1: 0.08637
Median: 0.09587
Q3: 0.1053
95-th percentile: 0.11878
Maximum: 0.1634
Range: 0.11077
Interquartile range (IQR): 0.01893

Descriptive statistics

Standard deviation: 0.01406412814
Coefficient of variation (CV): 0.1459535813
Kurtosis: 0.8559749304
Mean: 0.0963602812
Median Absolute Deviation (MAD): 0.0095
Skewness: 0.4563237648
Sum: 54.829
Variance: 0.0001977997003
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value               Count  Frequency (%)
0.1007              5      0.9%
0.1075              4      0.7%
0.1054              4      0.7%
0.115               4      0.7%
0.1089              3      0.5%
0.1037              3      0.5%
0.09462             3      0.5%
0.1049              3      0.5%
0.08511             3      0.5%
0.1066              3      0.5%
Other values (464)  534    93.8%

Minimum 5 values
Value               Count  Frequency (%)
0.05263             1      0.2%
0.06251             1      0.2%
0.06429             1      0.2%
0.06576             1      0.2%
0.06613             1      0.2%

Maximum 5 values
Value               Count  Frequency (%)
0.1634              1      0.2%
0.1447              1      0.2%
0.1425              1      0.2%
0.1398              1      0.2%
0.1371              1      0.2%

MeanCompactness
Real number (ℝ≥0)

Distinct: 537
Distinct (%): 94.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1043409842
Minimum: 0.01938
Maximum: 0.3454
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.4 KiB

Quantile statistics

Minimum: 0.01938
5-th percentile: 0.04066
Q1: 0.06492
Median: 0.09263
Q3: 0.1304
95-th percentile: 0.2087
Maximum: 0.3454
Range: 0.32602
Interquartile range (IQR): 0.06548

Descriptive statistics

Standard deviation: 0.05281275793
Coefficient of variation (CV): 0.5061554512
Kurtosis: 1.650130467
Mean: 0.1043409842
Median Absolute Deviation (MAD): 0.03263
Skewness: 1.190123031
Sum: 59.37002
Variance: 0.0027891874
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value               Count  Frequency (%)
0.1206              3      0.5%
0.1147              3      0.5%
0.04994             2      0.4%
0.1339              2      0.4%
0.1267              2      0.4%
0.2087              2      0.4%
0.07722             2      0.4%
0.07698             2      0.4%
0.03834             2      0.4%
0.1599              2      0.4%
Other values (527)  547    96.1%

Minimum 5 values
Value               Count  Frequency (%)
0.01938             1      0.2%
0.02344             1      0.2%
0.0265              1      0.2%
0.02675             1      0.2%
0.03116             1      0.2%

Maximum 5 values
Value               Count  Frequency (%)
0.3454              1      0.2%
0.3114              1      0.2%
0.2867              1      0.2%
0.2839              1      0.2%
0.2832              1      0.2%

MeanConcavity
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 537
Distinct (%): 94.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.08879931582
Minimum: 0
Maximum: 0.4268
Zeros: 13
Zeros (%): 2.3%
Memory size: 4.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.0049826
Q1: 0.02956
Median: 0.06154
Q3: 0.1307
95-th percentile: 0.24302
Maximum: 0.4268
Range: 0.4268
Interquartile range (IQR): 0.10114

Descriptive statistics

Standard deviation: 0.07971980871
Coefficient of variation (CV): 0.8977525105
Kurtosis: 1.998637529
Mean: 0.08879931582
Median Absolute Deviation (MAD): 0.04046
Skewness: 1.401179739
Sum: 50.5268107
Variance: 0.0063552479
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value               Count  Frequency (%)
0                   13     2.3%
0.1204              3      0.5%
0.01342             2      0.4%
0.03344             2      0.4%
0.02688             2      0.4%
0.01997             2      0.4%
0.01972             2      0.4%
0.1974              2      0.4%
0.06726             2      0.4%
0.1085              2      0.4%
Other values (527)  537    94.4%

Minimum 5 values
Value               Count  Frequency (%)
0                   13     2.3%
0.000692            1      0.2%
0.0009737           1      0.2%
0.001194            1      0.2%
0.001461            1      0.2%

Maximum 5 values
Value               Count  Frequency (%)
0.4268              1      0.2%
0.4264              1      0.2%
0.4108              1      0.2%
0.3754              1      0.2%
0.3635              1      0.2%

MeanConcavePoints
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 542
Distinct (%): 95.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.04891914587
Minimum: 0
Maximum: 0.2012
Zeros: 13
Zeros (%): 2.3%
Memory size: 4.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.0056208
Q1: 0.02031
Median: 0.0335
Q3: 0.074
95-th percentile: 0.12574
Maximum: 0.2012
Range: 0.2012
Interquartile range (IQR): 0.05369

Descriptive statistics

Standard deviation: 0.03880284486
Coefficient of variation (CV): 0.7932036459
Kurtosis: 1.066555703
Mean: 0.04891914587
Median Absolute Deviation (MAD): 0.02014
Skewness: 1.171180081
Sum: 27.834994
Variance: 0.001505660769
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value               Count  Frequency (%)
0                   13     2.3%
0.02864             3      0.5%
0.1242              2      0.4%
0.05252             2      0.4%
0.05778             2      0.4%
0.02031             2      0.4%
0.01924             2      0.4%
0.01615             2      0.4%
0.02594             2      0.4%
0.1471              2      0.4%
Other values (532)  537    94.4%

Minimum 5 values
Value               Count  Frequency (%)
0                   13     2.3%
0.001852            1      0.2%
0.002404            1      0.2%
0.002924            1      0.2%
0.002941            1      0.2%

Maximum 5 values
Value               Count  Frequency (%)
0.2012              1      0.2%
0.1913              1      0.2%
0.1878              1      0.2%
0.1845              1      0.2%
0.1823              1      0.2%

MeanSymmetry
Real number (ℝ≥0)

Distinct: 432
Distinct (%): 75.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1811618629
Minimum: 0.106
Maximum: 0.304
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.4 KiB

Quantile statistics

Minimum: 0.106
5-th percentile: 0.1415
Q1: 0.1619
Median: 0.1792
Q3: 0.1957
95-th percentile: 0.23072
Maximum: 0.304
Range: 0.198
Interquartile range (IQR): 0.0338

Descriptive statistics

Standard deviation: 0.02741428134
Coefficient of variation (CV): 0.1513247926
Kurtosis: 1.287932992
Mean: 0.1811618629
Median Absolute Deviation (MAD): 0.0171
Skewness: 0.7256089734
Sum: 103.0811
Variance: 0.0007515428212
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value               Count  Frequency (%)
0.1714              4      0.7%
0.1769              4      0.7%
0.1893              4      0.7%
0.1717              4      0.7%
0.1601              4      0.7%
0.2116              3      0.5%
0.193               3      0.5%
0.1619              3      0.5%
0.1861              3      0.5%
0.172               3      0.5%
Other values (422)  534    93.8%

Minimum 5 values
Value               Count  Frequency (%)
0.106               1      0.2%
0.1167              1      0.2%
0.1203              1      0.2%
0.1215              1      0.2%
0.122               1      0.2%

Maximum 5 values
Value               Count  Frequency (%)
0.304               1      0.2%
0.2906              1      0.2%
0.2743              1      0.2%
0.2678              1      0.2%
0.2655              1      0.2%

MeanFractalDimension
Real number (ℝ≥0)

Distinct: 499
Distinct (%): 87.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.06279760984
Minimum: 0.04996
Maximum: 0.09744
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.4 KiB

Quantile statistics

Minimum: 0.04996
5-th percentile: 0.053926
Q1: 0.0577
Median: 0.06154
Q3: 0.06612
95-th percentile: 0.07609
Maximum: 0.09744
Range: 0.04748
Interquartile range (IQR): 0.00842

Descriptive statistics

Standard deviation: 0.007060362795
Coefficient of variation (CV): 0.1124304382
Kurtosis: 3.00589212
Mean: 0.06279760984
Median Absolute Deviation (MAD): 0.00422
Skewness: 1.304488813
Sum: 35.73184
Variance: 4.98487228e-05
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value               Count  Frequency (%)
0.05667             3      0.5%
0.06113             3      0.5%
0.05913             3      0.5%
0.06782             3      0.5%
0.05907             3      0.5%
0.05715             2      0.4%
0.06758             2      0.4%
0.06331             2      0.4%
0.05866             2      0.4%
0.06284             2      0.4%
Other values (489)  544    95.6%

Minimum 5 values
Value               Count  Frequency (%)
0.04996             1      0.2%
0.05024             1      0.2%
0.05025             1      0.2%
0.05044             1      0.2%
0.05054             1      0.2%

Maximum 5 values
Value               Count  Frequency (%)
0.09744             1      0.2%
0.09575             1      0.2%
0.09502             1      0.2%
0.09296             1      0.2%
0.0898              1      0.2%

Interactions

[Figures: 100 pairwise interaction plots, not reproduced here]

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for points lying on a straight line the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
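The calculation described above can be sketched in plain Python. The function name is illustrative, and the data are the MeanRadius and MeanPerimeter values from the first ten rows shown in the Sample section, so the result describes those ten rows only, not the full dataset.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their
    standard deviations. The 1/n factors cancel, so plain sums of
    deviations are enough. Constant input (zero spread) would cause
    a division by zero and is not handled here."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


# First ten rows of the sample: MeanRadius and MeanPerimeter.
radius = [17.99, 20.57, 19.69, 11.42, 20.29,
          12.45, 18.25, 13.71, 13.00, 12.46]
perimeter = [122.80, 132.90, 130.00, 77.58, 135.10,
             82.57, 119.60, 90.20, 87.50, 83.97]

r = pearson_r(radius, perimeter)
# Shifting and rescaling one variable leaves r unchanged.
r_shifted = pearson_r([2 * x + 3 for x in radius], perimeter)
print(round(r, 3), math.isclose(r, r_shifted))
```

A value close to +1 for these two columns is in line with the high-correlation warning raised for MeanRadius and MeanPerimeter.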

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
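A minimal sketch of this rank-based calculation, assuming tied values receive the average of the ranks they span (a common convention, not confirmed by the report). All function names are illustrative.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of standard deviations."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


def spearman_rho(xs, ys):
    """Pearson's r applied to the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# A monotonic but nonlinear relationship: y = x cubed.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

On this toy data ρ reaches its maximum because the ordering of the two variables agrees exactly, while r stays below 1 because the relationship is not a straight line.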

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
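The pair-counting definition can be sketched directly. This is the simplest form of the coefficient, in which tied pairs count as neither concordant nor discordant; statistical libraries often apply a tie correction, so their results may differ when ties are present. The function name is illustrative.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    pairs = list(combinations(zip(xs, ys), 2))
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in pairs:
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
    return (concordant - discordant) / len(pairs)


# Three observations: two concordant pairs and one discordant pair.
print(kendall_tau([1, 2, 3], [1, 3, 2]))
```

This direct version compares every pair of observations, so its cost grows quadratically with the number of rows.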

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

[Figures: missing-value plots, not reproduced here]

Sample

First rows

Row  Diagnosis  MeanRadius  MeanTexture  MeanPerimeter  MeanArea  MeanSmoothness  MeanCompactness  MeanConcavity  MeanConcavePoints  MeanSymmetry  MeanFractalDimension
0    M          17.99       10.38        122.80         1001.0    0.11840         0.27760          0.30010        0.14710            0.2419        0.07871
1    M          20.57       17.77        132.90         1326.0    0.08474         0.07864          0.08690        0.07017            0.1812        0.05667
2    M          19.69       21.25        130.00         1203.0    0.10960         0.15990          0.19740        0.12790            0.2069        0.05999
3    M          11.42       20.38        77.58          386.1     0.14250         0.28390          0.24140        0.10520            0.2597        0.09744
4    M          20.29       14.34        135.10         1297.0    0.10030         0.13280          0.19800        0.10430            0.1809        0.05883
5    M          12.45       15.70        82.57          477.1     0.12780         0.17000          0.15780        0.08089            0.2087        0.07613
6    M          18.25       19.98        119.60         1040.0    0.09463         0.10900          0.11270        0.07400            0.1794        0.05742
7    M          13.71       20.83        90.20          577.9     0.11890         0.16450          0.09366        0.05985            0.2196        0.07451
8    M          13.00       21.82        87.50          519.8     0.12730         0.19320          0.18590        0.09353            0.2350        0.07389
9    M          12.46       24.04        83.97          475.9     0.11860         0.23960          0.22730        0.08543            0.2030        0.08243

Last rows

Row  Diagnosis  MeanRadius  MeanTexture  MeanPerimeter  MeanArea  MeanSmoothness  MeanCompactness  MeanConcavity  MeanConcavePoints  MeanSymmetry  MeanFractalDimension
559  B          11.51       23.93        74.52          403.5     0.09261         0.10210          0.11120        0.04105            0.1388        0.06570
560  B          14.05       27.15        91.38          600.4     0.09929         0.11260          0.04462        0.04304            0.1537        0.06171
561  B          11.20       29.37        70.67          386.0     0.07449         0.03558          0.00000        0.00000            0.1060        0.05502
562  M          15.22       30.62        103.40         716.9     0.10480         0.20870          0.25500        0.09429            0.2128        0.07152
563  M          20.92       25.09        143.00         1347.0    0.10990         0.22360          0.31740        0.14740            0.2149        0.06879
564  M          21.56       22.39        142.00         1479.0    0.11100         0.11590          0.24390        0.13890            0.1726        0.05623
565  M          20.13       28.25        131.20         1261.0    0.09780         0.10340          0.14400        0.09791            0.1752        0.05533
566  M          16.60       28.08        108.30         858.1     0.08455         0.10230          0.09251        0.05302            0.1590        0.05648
567  M          20.60       29.33        140.10         1265.0    0.11780         0.27700          0.35140        0.15200            0.2397        0.07016
568  B          7.76        24.54        47.92          181.0     0.05263         0.04362          0.00000        0.00000            0.1587        0.05884